
    AdaCompress: Adaptive Compression for Online Computer Vision Services

    With the growth of computer vision based applications and services, an explosive number of images has been uploaded to the cloud servers that host such computer vision algorithms, usually in the form of deep learning models. JPEG has been the de facto compression and encapsulation method for images before upload, due to its wide adoption. However, the standard JPEG configuration does not always perform well for compressing images that are to be processed by a deep learning model; e.g., the standard quality level of JPEG leads to 50% size overhead (compared with the best quality-level selection) on ImageNet under the same inference accuracy in popular computer vision models including InceptionNet, ResNet, etc. Even knowing this, designing a better JPEG configuration for online computer vision services remains extremely challenging: 1) cloud-based computer vision models are usually a black box to end users, so it is difficult to design a JPEG configuration without knowing their model structures; 2) the JPEG configuration has to change as different users use it. In this paper, we propose a reinforcement learning based JPEG configuration framework. In particular, we design an agent that adaptively chooses the compression level according to the input image's features and the backend deep learning models. We then train the agent with reinforcement learning to adapt it to different deep learning cloud services, which act as the interactive training environment and feed back a reward that jointly considers accuracy and data size. In our real-world evaluation on Amazon Rekognition, Face++ and Baidu Vision, our approach reduces the size of images by 1/2 to 1/3 while the overall classification accuracy decreases only slightly. Comment: ACM Multimedia
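The core idea above — an agent trading off backend accuracy against upload size — can be sketched as a simple bandit over JPEG quality levels. This is a minimal illustration, not the authors' code; the quality levels, reward weights, and class names are all assumptions for demonstration.

```python
import random

# Hypothetical sketch of the reward idea in the abstract: an agent picks a
# JPEG quality level and is rewarded for keeping the backend model accurate
# while shrinking the upload. Names and weights are illustrative.

QUALITY_LEVELS = [95, 85, 75, 65, 55]

def reward(accurate: bool, compressed_size: int, original_size: int,
           size_weight: float = 1.0) -> float:
    """Reward = accuracy indicator minus a penalty on relative size."""
    return (1.0 if accurate else 0.0) - size_weight * (compressed_size / original_size)

class EpsilonGreedyAgent:
    """A minimal bandit stand-in for the paper's RL policy."""

    def __init__(self, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.totals = {q: 0.0 for q in QUALITY_LEVELS}  # summed rewards
        self.counts = {q: 0 for q in QUALITY_LEVELS}    # times chosen

    def choose(self) -> int:
        # Explore occasionally; otherwise pick the best average reward so far.
        if random.random() < self.epsilon or not any(self.counts.values()):
            return random.choice(QUALITY_LEVELS)
        return max(QUALITY_LEVELS,
                   key=lambda q: self.totals[q] / max(self.counts[q], 1))

    def update(self, quality: int, r: float) -> None:
        self.totals[quality] += r
        self.counts[quality] += 1
```

The paper's actual agent also conditions on input-image features; a contextual policy (e.g. a small network over image statistics) would replace the plain bandit here.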

    SCPAT-GAN: Structural Constrained and Pathology Aware Convolutional Transformer-GAN for Virtual Histology Staining of Human Coronary OCT images

    There is a significant need to generate virtual histological information from coronary optical coherence tomography (OCT) images to better guide the treatment of coronary artery disease. However, existing methods either require a large pixel-wise paired training dataset or have limited capability to map pathological regions. To address these issues, we propose a structurally constrained, pathology aware, transformer generative adversarial network, namely SCPAT-GAN, to generate virtually stained H&E histology from OCT images. The proposed SCPAT-GAN advances existing methods via a novel design that imposes pathological guidance on structural layers using a transformer-based network. Comment: 9 pages, 4 figures

    Frequency-aware optical coherence tomography image super-resolution via conditional generative adversarial neural network

    Optical coherence tomography (OCT) has stimulated a wide range of medical image-based diagnosis and treatment in fields such as cardiology and ophthalmology. Such applications can be further facilitated by deep learning-based super-resolution technology, which improves the capability of resolving morphological structures. However, existing deep learning-based methods focus only on the spatial distribution and disregard frequency fidelity in image reconstruction, leading to a frequency bias. To overcome this limitation, we propose a frequency-aware super-resolution framework that integrates three critical frequency-based modules (i.e., frequency transformation, frequency skip connection, and frequency alignment) and a frequency-based loss function into a conditional generative adversarial network (cGAN). We conducted a large-scale quantitative study on an existing coronary OCT dataset to demonstrate the superiority of our proposed framework over existing deep learning frameworks. In addition, we confirmed the generalizability of our framework by applying it to fish corneal images and rat retinal images, demonstrating its capability to super-resolve morphological details in eye imaging. Comment: 13 pages, 7 figures, submitted to Biomedical Optics Express special issue
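The "frequency-based loss" idea can be illustrated in miniature: compare signals both in space and in their Fourier spectra, so reconstruction errors concentrated in particular frequency bands are penalized directly. This toy 1-D sketch (not the authors' code) uses a naive DFT for self-containment; the weighting is an illustrative assumption.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform of a real 1-D sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def frequency_aware_loss(pred, target, freq_weight=0.5):
    """Spatial L1 plus L1 over DFT magnitudes; freq_weight is illustrative."""
    spatial = sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
    pf, tf = dft(pred), dft(target)
    spectral = sum(abs(abs(a) - abs(b)) for a, b in zip(pf, tf)) / len(pred)
    return spatial + freq_weight * spectral
```

In a real 2-D training loop one would use a fast FFT (e.g. `torch.fft.fft2`) inside the cGAN objective rather than this O(n²) DFT.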

    Push the Boundary of SAM: A Pseudo-label Correction Framework for Medical Segmentation

    The segment anything model (SAM) has emerged as the leading approach for zero-shot learning in segmentation, offering the advantage of avoiding pixel-wise annotation. It is particularly appealing in medical image segmentation, where annotation is laborious and expertise-demanding. However, directly applying SAM often yields inferior results compared to conventional fully supervised segmentation networks. While SAM-generated pseudo labels can also benefit the training of fully supervised segmentation, the performance is limited by the quality of the pseudo labels. In this paper, we propose a novel label correction framework to push the boundary of SAM-based segmentation. Our model utilizes a novel noise detection module to distinguish noisy labels from clean labels. This enables us to correct the noisy labels using an uncertainty-based self-correction module, thereby enriching the clean training set. Finally, we retrain the network with the updated labels to optimize its weights for future predictions. One key advantage of our model is its ability to train deep networks using SAM-generated pseudo labels without relying on a subset of expert-level annotations. We demonstrate the effectiveness of the proposed model on both X-ray and lung CT datasets, showing its ability to improve segmentation accuracy and outperform baseline methods in label correction.
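The uncertainty-based self-correction step can be sketched as follows: a pseudo label is flipped only where the current model disagrees with it *and* is confident, while uncertain pixels are left untouched. This is a hypothetical per-pixel sketch, not the paper's module; the 0.9 confidence threshold is an assumption.

```python
def correct_pseudo_labels(pseudo, probs, confidence=0.9):
    """Flip pseudo labels only under confident disagreement.

    pseudo: list of 0/1 pseudo labels (e.g. from SAM), one per pixel.
    probs:  list of model P(foreground) estimates, one per pixel.
    """
    corrected = []
    for label, p in zip(pseudo, probs):
        pred = 1 if p >= 0.5 else 0
        certainty = max(p, 1.0 - p)  # distance from the decision boundary
        if pred != label and certainty >= confidence:
            corrected.append(pred)   # confident disagreement: treat as noise, fix it
        else:
            corrected.append(label)  # keep the pseudo label
    return corrected
```

The paper's actual pipeline interposes a learned noise detection module before this step; here the confidence test plays both roles for brevity.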

    Deep Learning based 3D Segmentation: A Survey

    3D object segmentation is a fundamental and challenging problem in computer vision, with applications in autonomous driving, robotics, augmented reality and medical image analysis. It has received significant attention from the computer vision, graphics and machine learning communities. Traditionally, 3D segmentation was performed with hand-crafted features and engineered methods, which failed to achieve acceptable accuracy and could not generalize to large-scale data. Driven by their great success in 2D computer vision, deep learning techniques have recently become the tool of choice for 3D segmentation tasks as well. This has led to an influx of methods in the literature that have been evaluated on different benchmark datasets. This paper provides a comprehensive survey of recent progress in deep learning based 3D segmentation, covering over 150 papers. It summarizes the most commonly used pipelines, discusses their highlights and shortcomings, and analyzes the competitive results of these segmentation methods. Based on the analysis, it also provides promising research directions for the future. Comment: Under review at ACM Computing Surveys; 36 pages, 10 tables, 9 figures

    First Place Solution to the CVPR'2023 AQTC Challenge: A Function-Interaction Centric Approach with Spatiotemporal Visual-Language Alignment

    Affordance-Centric Question-driven Task Completion (AQTC) has been proposed to acquire knowledge from videos in order to furnish users with comprehensive and systematic instructions. However, existing methods have hitherto neglected the necessity of aligning spatiotemporal visual and linguistic signals, as well as the crucial interaction information between humans and objects. To tackle these limitations, we propose to combine large-scale pre-trained vision-language and video-language models, which contribute stable and reliable multimodal data and facilitate effective spatiotemporal visual-textual alignment. Additionally, a novel hand-object-interaction (HOI) aggregation module is proposed, which aids in capturing human-object interaction information, thereby further augmenting the capacity to understand the presented scenario. Our method achieved first place in the CVPR'2023 AQTC Challenge, with a Recall@1 score of 78.7%. The code is available at https://github.com/tomchen-ctj/CVPR23-LOVEU-AQTC. Comment: Winner of the CVPR 2023 Long-form Video Understanding and Generation Challenge (Track 3)

    The role of serum vitamin D in patients with normal ovarian reserve undergoing the first IVF/ICSI cycle

    Background: The debate over the impact of vitamin D in assisted reproduction continues. The purpose of our study was to assess embryo quality and pregnancy outcomes among groups with different vitamin D levels after the first in vitro fertilization (IVF)/intracytoplasmic sperm injection (ICSI) cycle in patients with normal ovarian reserve (NOR).
    Methods: Patients in this retrospective cohort study were divided into three groups: a severe vitamin D deficiency group (25OH-D < 10 ng/ml), a vitamin D deficiency group (10 ng/ml ≤ 25OH-D < 20 ng/ml), and a non-vitamin D deficiency group (25OH-D ≥ 20 ng/ml). The primary outcome was clinical pregnancy, while the secondary outcomes were mature oocytes, oocyte fertilization, available cleavage embryos, available blastocysts, biochemical pregnancy, early abortion, and embryo implantation. A modified Poisson regression model and multiple linear regression analysis were conducted for the multivariate analysis.
    Results: 264 NOR patients undergoing their first IVF/ICSI cycle were included. For the primary outcome, there was no significant difference in clinical pregnancy between the severe vitamin D deficiency group and the other two groups (vitamin D deficiency group: adjusted RR = 1.026; 0.780-1.350; P = 0.854; non-vitamin D deficiency group: adjusted RR = 1.092; 0.743-1.605; P = 0.652). For all secondary outcomes, no significant differences were observed among the three groups (P > 0.05). Exploratory subgroup analyses concerning the season of embryo transfer, phase of embryo transferred, and endometrial thickness, as well as a sensitivity analysis using logistic regression models for the primary outcome, revealed comparable clinical pregnancy rates among the groups (P > 0.05). Subgroup analysis by ovarian stimulation protocol indicated that, in the gonadotrophin-releasing hormone (GnRH) antagonist protocol subgroup, the clinical pregnancy rate of the non-vitamin D deficiency group was significantly higher than that of the other two groups (P < 0.05).
    Conclusion: Serum vitamin D level was not associated with embryo quality or pregnancy outcomes in patients with NOR. Further studies with greater sample sizes and longer follow-up are needed to elucidate the relationships between vitamin D levels and IVF outcomes.
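For readers unfamiliar with the risk ratios (RR) reported above: the study fit a modified Poisson regression to obtain covariate-adjusted RRs, but the unadjusted quantity can be illustrated from 2x2 counts with a standard Wald-type confidence interval on the log scale. The counts in the usage note are made up for demonstration; this is not the study's data or analysis code.

```python
import math

def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted risk ratio of group A vs. group B with a Wald CI.

    Uses the standard log-RR standard error
    sqrt(1/a - 1/n_a + 1/b - 1/n_b).
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi
```

For example, `risk_ratio_ci(30, 60, 30, 60)` (identical hypothetical pregnancy rates in two groups) yields RR = 1.0 with an interval straddling 1, i.e. no detectable difference — the pattern the study reports for its primary outcome.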